
39 research outputs found

Disentangled Feature Learning for Real-Time Neural Speech Coding

Authors: Jiang Xue, Lu Yan, Peng Xiulian, Zhang Yuan
Publication date: 21/11/2022

Recently, end-to-end neural audio/speech coding has shown great potential to outperform traditional signal-analysis-based audio codecs. This is mostly achieved by following the VQ-VAE paradigm, where blind features are learned, vector-quantized and coded. In this paper, instead of blind end-to-end learning, we propose to learn disentangled features for real-time neural speech coding. Specifically, more global-like speaker identity and local content features are learned with disentanglement to represent speech. Such a compact feature decomposition not only achieves better coding efficiency by exploiting bit allocation among different features but also provides the flexibility to do audio editing in embedding space, such as voice conversion in real-time communications. Both subjective and objective results demonstrate its coding efficiency, and we find that the learned disentangled features show comparable performance on any-to-any voice conversion to modern self-supervised speech representation learning models, with far fewer parameters and low latency, showing the potential of our neural coding framework. Comment: Submitted to ICASSP202

arXiv.org e-Print Archive

Interactive Speech and Noise Modeling for Speech Enhancement

Authors: Lu Yan, Peng Xiulian, Srinivasan Sriram, Zhang Yuan, Zheng Chengyu
Publication date: 14/04/2021

Speech enhancement is challenging because of the diversity of background noise types. Most existing methods focus on modelling the speech rather than the noise. In this paper, we propose a novel idea to model speech and noise simultaneously in a two-branch convolutional neural network, namely SN-Net. In SN-Net, the two branches predict speech and noise, respectively. Instead of fusing information only at the final output layer, interaction modules are introduced at several intermediate feature domains between the two branches so that each benefits the other. Such an interaction can leverage features learned from one branch to counteract the undesired part and restore the missing component of the other, and thus enhance their discrimination capabilities. We also design a feature extraction module, namely residual-convolution-and-attention (RA), to capture the correlations along temporal and frequency dimensions for both speech and noise. Evaluations on public datasets show that the interaction module plays a key role in simultaneous modeling and that SN-Net outperforms the state-of-the-art by a large margin on various evaluation metrics. The proposed SN-Net also shows superior performance for speaker separation. Comment: AAAI 2021 (Accepted

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

DasFormer: Deep Alternating Spectrogram Transformer for Multi/Single-Channel Speech Separation

Authors: Kong Xiangyu, Lu Yan, Movassagh Mahmood, Peng Xiulian, Prakash Vinod, Wang Shuo
Publication date: 14/03/2023

For the task of speech separation, previous studies usually treat multi-channel and single-channel scenarios as two research tracks, with specialized solutions developed for each. Instead, we propose a simple and unified architecture, DasFormer (Deep alternating spectrogram transFormer), to handle both of them in challenging reverberant environments. Unlike frame-wise sequence modeling, each TF-bin in the spectrogram is assigned an embedding encoding spectral and spatial information. With such input, DasFormer is formed by multiple repetitions of a simple block, each of which integrates 1) two multi-head self-attention (MHSA) modules that alternately process within each frequency bin and within each temporal frame of the spectrogram, and 2) an MBConv before each MHSA for modeling local features on the spectrogram. Experiments show that DasFormer has a powerful ability to model the time-frequency representation: its performance far exceeds the current SOTA models in multi-channel speech separation, and it also achieves single-channel SOTA in the more challenging yet realistic reverberation scenario. Comment: 5 pages, accepted by ICASSP202

arXiv.org e-Print Archive

Notch1 is required for hypoxia-induced proliferation, invasion and chemoresistance of T-cell acute lymphoblastic leukemia cells

Authors: Dai Jianjian, Ji Chunyan, Li Peng, Liu Na, Lu Fei, Ma Daoxin, Park Jino, Qu Xun, Sun Xiulian, Ye Jingjing, Zou Jie
Publication venue: The Research Repository @ WVU
Publication date: 01/01/2013

Background: Notch1 is a potent regulator known to play an oncogenic role in many malignancies including T-cell acute lymphoblastic leukemia (T-ALL). Tumor hypoxia and increased hypoxia-inducible factor-1α (HIF-1α) activity can act as major stimuli for tumor aggressiveness and progression. Although hypoxia-mediated activation of the Notch1 pathway plays an important role in tumor cell survival and invasiveness, the interaction between HIF-1α and Notch1 has not yet been identified in T-ALL. This study was designed to investigate whether hypoxia activates Notch1 signalling through HIF-1α stabilization and to determine the contribution of hypoxia and HIF-1α to proliferation, invasion and chemoresistance in T-ALL. Methods: T-ALL cell lines (Jurkat, Sup-T1) transfected with HIF-1α or Notch1 small interfering RNA (siRNA) were incubated in normoxic or hypoxic conditions. Their potential for proliferation and invasion was measured by WST-8 and transwell assays. Flow cytometry was used to detect apoptosis and assess cell cycle regulation. Expression and regulation of components of the HIF-1α and Notch1 pathways and of genes related to proliferation, invasion and apoptosis were assessed by quantitative real-time PCR or Western blot. Results: Hypoxia potentiated Notch1 signalling via stabilization and activation of the transcription factor HIF-1α. Hypoxia/HIF-1α-activated Notch1 signalling altered expression of cell cycle regulatory proteins and accelerated cell proliferation. Hypoxia-induced Notch1 activation increased the expression of matrix metalloproteinase-2 (MMP2) and MMP9, which increased invasiveness. Of greater clinical significance, knockdown of Notch1 prevented the protective effect of hypoxia/HIF-1α against dexamethasone-induced apoptosis. This sensitization correlated with loss of the effect of hypoxia/HIF-1α on Bcl-2 and Bcl-xL expression. Conclusions: Notch1 signalling is required for hypoxia/HIF-1α-induced proliferation, invasion and chemoresistance in T-ALL. Pharmacological inhibitors of HIF-1α or Notch1 signalling may be attractive interventions for T-ALL treatment.

Springer - Publisher Connector

PubMed Central

The Research Repository @ WVU (West Virginia University)